Restricted Value Iteration: Theory and Algorithms

نویسندگان
چکیده

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Restricted Value Iteration: Theory and Algorithms

Value iteration is a popular algorithm for finding near optimal policies for POMDPs. It is inefficient due to the need to account for the entire belief space, which necessitates the solution of large numbers of linear programs. In this paper, we study value iteration restricted to belief subsets. We show that, together with properly chosen belief subsets, restricted value iteration yields near-...

متن کامل

Topological Value Iteration Algorithms

Value iteration is a powerful yet inefficient algorithm for Markov decision processes (MDPs) because it puts the majority of its effort into backing up the entire state space, which turns out to be unnecessary in many cases. In order to overcome this problem, many approaches have been proposed. Among them, ILAO* and variants of RTDP are state-of-the-art ones. These methods use reachability anal...

متن کامل

Polynomial Value Iteration Algorithms for Detrerminstic MDPs

Value iteration is a commonly used and em­ pirically competitive method in solving many Markov decision process problems. However, it is known that value iteration has only pseudo­ polynomial complexity in general. We estab­ lish a somewhat surprising polynomial bound for value iteration on deterministic Markov decision (DMDP) problems. We show that the basic value iteration procedure converges...

متن کامل

Polynomial Value Iteration Algorithms for Deterministic MDPs

Value iteration is a commonly used and empirically competitive method in solving many Markov decision process problems. However, it is known that value iteration has only pseudopolynomial complexity in general. We establish a somewhat surprising polynomial bound for value iteration on deterministic Markov decision (DMDP) problems. We show that the basic value iteration procedure converges to th...

متن کامل

Convergence of Indirect Adaptive Asynchronous Value Iteration Algorithms

Reinforcement Learning methods based on approximating dynamic programming (DP) are receiving increased attention due to their utility in forming reactive control policies for systems embedded in dynamic environments. Environments are usually modeled as controlled Markov processes, but when the environment model is not known a priori, adaptive methods are necessary. Adaptive control methods are ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Journal of Artificial Intelligence Research

سال: 2005

ISSN: 1076-9757

DOI: 10.1613/jair.1379